A Novel Algorithm for String Matching with Mismatches

Author

  • Vinod-prasad P.
Abstract

We present an online algorithm for pattern matching in strings. The problem we investigate is commonly known as "string matching with mismatches", in which the objective is to report the number of matching characters when the pattern is aligned with every location in the text. The method we propose is based on the frequencies of individual characters in the pattern and the text. Given a pattern of length M and a text of length N, both defined over an alphabet of size σ, the algorithm uses O(M) space and runs in O(MN/σ) time on average. This average running time simplifies to O(N) for patterns of length M ≤ σ. The algorithm uses simple arrays, which avoids the overhead of maintaining complex data structures such as suffix trees or automata.
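The abstract does not spell out the procedure, so the following is only a minimal sketch of one well-known counting technique that is consistent with the stated bounds; it is not claimed to be the author's exact method. It assumes the pattern is indexed by character (a position list per symbol) and that a circular buffer of M counters holds the match counts of the alignments currently in progress. The function name `match_counts` and all variable names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List


def match_counts(pattern: str, text: Iterable[str]) -> Iterator[int]:
    """Yield, for each alignment start 0..N-M, the number of matching characters.

    Illustrative sketch only (an assumption, not the paper's published code).
    """
    m = len(pattern)
    if m == 0:
        return

    # Index the pattern: for each character, the positions where it occurs.
    # Total size is M entries, so the index uses O(M) space.
    positions: Dict[str, List[int]] = defaultdict(list)
    for i, ch in enumerate(pattern):
        positions[ch].append(i)

    # Circular buffer: counts[s % m] holds the running count for alignment s.
    # At most M alignments are in progress at any time.
    counts = [0] * m

    for j, ch in enumerate(text):
        # Text character ch at position j matches pattern position i
        # exactly when the pattern starts at j - i.
        for i in positions.get(ch, ()):
            start = j - i
            if start >= 0:
                counts[start % m] += 1

        # The alignment starting at j - m + 1 has now seen all M characters.
        finished = j - m + 1
        if finished >= 0:
            slot = finished % m
            yield counts[slot]
            counts[slot] = 0  # slot is reused by the alignment starting at j + 1
```

For example, `list(match_counts("aba", "ababa"))` yields one count per alignment of the pattern against the text. The inner loop runs once per pair of equal characters, which for uniformly random text over σ symbols is about M/σ iterations per text character; that is where an average cost of O(MN/σ) would come from under this assumption. Because the text is consumed one character at a time, the sketch works on streams as well as on in-memory strings.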


Similar resources

On string matching with k mismatches

In this paper we consider several variants of the pattern matching problem. In particular, we investigate the following problems: 1) Pattern matching with k mismatches; 2) Approximate counting of mismatches; and 3) Pattern matching with mismatches. The distance metric used is the Hamming distance. We present some novel algorithms and techniques for solving these problems. Both deterministic and...


A Parallel Algorithm for Fixed-Length Approximate String-Matching with k-mismatches

This paper deals with the approximate string-matching problem with Hamming distance. The approximate string-matching with k-mismatches problem is to find all locations at which a query of length m matches a factor of a text of length n with k or fewer mismatches. The approximate string-matching algorithms have both pleasing theoretical features, as well as direct applications, especially in comp...


Reduced Nondeterministic Finite Automata for Approximate String Matching

We will show how to reduce the number of states of nondeterministic finite automata for approximate string matching with k mismatches and nondeterministic finite automata for approximate string matching with k differences in the case when we do not need to know how many mismatches or differences are in the found string. We will also show the impact of this reduction on Shift-Or based algorithms.


String Matching with Mismatches by Real-Valued FFT

String matching with mismatches is a basic concept of information retrieval with some kinds of approximation. This paper proposes an FFT-based algorithm for the problem of string matching with mismatches, which computes an estimate with accuracy. The algorithm consists of FFT computations for binary vectors which can be computed faster than the computation for vectors of complex numbers. Theref...


On String Matching with Mismatches

In this paper, we consider several variants of the pattern matching with mismatches problem. In particular, given a text T = t_1 t_2 ··· t_n and a pattern P = p_1 p_2 ··· p_m, we investigate the following problems: (1) pattern matching with mismatches: for every i, 1 ≤ i ≤ n − m + 1, output the distance between P and t_i t_{i+1} ··· t_{i+m−1}; and (2) pattern matching with k mismatches: output those posit...



Publication date: 2016